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We illustrate for 4D SU{2) and ?7(1) lattice gauge theory that sampling with a biased Metropolis 
scheme is essentially equivalent to using the heat bath algorithm. Only, the biased Metropolis 
method can also be applied when an efficient heat bath algorithm does not exist. For the examples 
discussed the biased Metropolis algorithm is also better suited for parallelization than the heat bath 
algorithms. 
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I. INTRODUCTION 

The possibility of constructing biased Metropolis algo¬ 
rithms (BMAs) is known since quite a while [1]. Although 
they have occasionally been used in the statistical physics 
[2] and bio-chemical [3] literature, it appears that practi¬ 
tioners of Markov chain Monte Carlo (MC) simulations 
have not given this topic the attention which it deserves. 
Reasons for this seem to be that (a) general situations 
for which BMAs are of advantage have not been clearly 
identihed and (b) a lack of straightforward instructions 
about implementing such schemes. 

On the other hand, the heat bath algorithm (HBA) is 
one of the widely used algorithms for MC simulations. It 
updates a variable with the Gibbs-Boltzmann probability 
defined by its interaction with the rest of the system (an 
introduction to HBAs can, e.g., be found in Ref. [4]). 
But, there exist energy functions for which an efficient 
heat bath implementation does not exist. 

In this paper we show that a BMA similar to the one 
used for the rugged Metroplis method of Ref. [5], can 
be employed whenever one would normally think about 
constructing a HBA. When an efficient heat bath imple¬ 
mentation exists, the performance of the HBA and the 
BMA will practically be identical. However, the BMA 
can still be constructed when the inversion of the cumu¬ 
lative distribution, required for a HBA, is numerically so 
slow that it is not a suitable option. 

In the next section we illustrate our general observation 
for systems from lattice gauge theory. Our first example 
is 4D SU{2) lattice gauge theory for which the HBA was 
first introduced by Creutz [6] and improved in Ref. [7] 
and [8]. Our second example is 4D 17(1) gauge theory. 


II. BIASED METROPOLIS ALGORITHMS AND 
PURE LATTICE GAUGE THEORY 

The action which we consider is 


Ua = where the sum is over all pla- 

quettes of a 4D simple hypercubic lattice, ii, ji, Z 2 and 
j 2 label the sites circulating about the plaquette and Uji 
is a U{1) or a SU{2) matrix {Nc = 1 or 2) associated 
with the link (ij). The reversed link is associated with 
the inverse matrix. The aim is to calculate expectation 
values with respect to the Euclidean path integral 

^ = / n (2) 

{ij) 


where the integrations are over the invariant group mea¬ 
sure. While working at a particular link (ij), we need 
only to consider the contribution to S, which comes from 
the staples containing this link. If we denote by Uu,k, 
k = 1,..., 6, the products which interact with the link in 
question, then the probability density of this link matrix 
is 


dP(U) ~ dU exp 


^ReTr 

Nc 



( 3 ) 


A. SU{2) 

We deal first with SU{2) and parametrize the matrix 
elements in the form 


U = aQl-\-id-a, aQ-|-a^ = l, (4) 

where I denotes the 2x2 identity matrix and u are the 
Pauli matrices. A property of SU{2) group elements is 
that any sum of them is proportional to another SU (2) el¬ 
ement. We define a SU{2) matrix Uu which corresponds 
to the sum of the staples in equation (3) by 


6 

Su Uu = Uu,k, Su 

k=l 


det 



( 5 ) 


S({U}) = ^Y.^eTi:(Uu) , 

^ □ 


( 1 ) 


Using the invariance of the group measure, one finds 
dP (U dUl dao exp ((3g Su ag) (6) 
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where dft is the differential solid angle of a. As it is 
straightforward to generate the solid angle stochastically, 
the problem is reduced to sampling the probability den¬ 
sity 

P{ao) ~ y 1 - exp {Pg su oo) (7) 

in the interval — 1 < oq < 1. This is the starting point 
for the HBA, which amounts to finding a numerically fast 
inversion of the cumulative distribution function 

F{ao) = No J da'o ^ 1 - aj, ^ exp [Pg Su Og) (8) 

where No ensures the normalization F{1) = 1. The HBA 
updates Og by converting a uniformly distributed random 
number 0 < a: < 1 into ag = 

The remark of our paper is that a crude tabulation of 
the function F(ag) is entirely sufficient to obtain prac¬ 
tically the same efficiency as with the HBA. Obviously, 
such a tabulation can still be done when there is no nu¬ 
merically efficient way to calculate F~^{x). The proce¬ 
dure does still generate the canonical probabilities of the 
continuous theory (2) without any approximation (ex¬ 
cept by the floating point precision and limitations of 
the random number generator). 

Let us show how this works. First we choose a dis¬ 
cretization of the parameter su, 0 < Su < 6, into m 
discrete values Sy, i = 1,..., m so that 

0 < si < < ... < s™ (9) 

holds. We take these values equidistant. Other partitions 
work too and could be more efficient. For each Sy we 
calculate a table of values , j = 1,... ,n defined by 

^=A(4’^;si) (10) 

and we also tabulate the differences 

Aug’-^ = Og"^ - Og^”^ for j = l,...,n (11) 

where we define Og^ = —1, and Og” = -1-1 follows from 
Eq. (10). For Pg = 2.3 this construction is shown in 
Fig. 1 using a representative Sy value. 

The biased Metropolis procedure for one update of a 
SU{2) matrix consists now of the following steps: 

1. Find the Sy value (only i is needed) which is nearest 
to the actual sy value. 

2. Place the present og value on the discretization 
grid, i.e., find the integer j through the relation 
flg < Oo < ad ■ 

3. Pick an integer value j' uniformly distributed in the 
range 1 to n. 

4. Propose Og = + x*" Aog^ , where x*", 0 < 

x*" < 1, is a uniformly distributed random number. 



FIG. 1. Discretization of the cumulative distribution func¬ 
tion F{ao-,sl}) for SU{2) at Pg — 2.3 for the choices m = 16 
(equidistant sh values, i.e., = 3.9375) and n = 2^ = 16. 

5. Accept a'o with the probability 

^ exp(/3gsya()) Aog’^ 
exp(/3gSyag) Aag’-^ 

6. If a'o is accepted, calculate a random value for a' 
with the measure d^2 and store the new SU (2) ma¬ 
trix. Otherwise keep the old SU{2) matrix. After 
this step the configuration has to be counted inde¬ 
pendently of whether a'o was accepted or rejected. 

For i given each interval on the ag abscissa of Fig. 1 is 
proposed with probability 1/n. In the limit n > to, m —> 
oo these are by construction the heat bath probabilities, 
so that the acceptance rate becomes one. For a reason¬ 
ably accurate discretization the algorithm is still exact 
due to the factor Aoq^ !ha'd^ in the acceptance proba¬ 
bility (12), and the acceptance rate remains close to one. 
Therefore, the relative efficiency of a HBA versus our 
BMA becomes to a large extent a matter of CPU time 
consumption. 

Only step 2 of the BMA procedure requires some 
thought, all others are straightforward numerical calcu¬ 
lations. For n = 2”^ the interval label j of the existing 
ag can be determined in steps using the binary search 
recursion 

j ^ j + sign (ag - a'd^ ), A ^ Z2 - 1 (13) 

where the starting values are A = n 2 — 2 and j = 2”^“^, 
and the termination is for A = 0 (after which one final 
logical decision has to be made). As long as a uniform 
discretization of Sy is chosen, there is no slowing down of 
the code with an increase of the size to of the table, while 
there is a logarithmic slowing down with an increase of 
the Aag"^ discretization. For the same choice of to and n 


2 






























a 


FIG. 2. Partition of the Aoq^ values for SU{2) at /?g = 2.3, 
where the variable a = (3g Su is used on the abscissa. The 
choices for m and n are the same as in Fig. 1. 

values as used in Fig. 1, the partition of all Aag-^ values 
is shown in Fig. 2. For each bin i on the abscissa the 
values are calculated for its central value a* = /3g Sy. For 
our simulations we used a finer discretization, m = 32 
and n = 128. 

Table I illustrates the performance of the SU{2) BMA 
for a long run on a 4 x 16^ lattice at /3g = 2.3. At this cou¬ 
pling the system exhibits critical slowing down, because 
of its neighborhood to the deconfining phase transition 
(see for instance [9] and references therein). Our com¬ 
parison is with the Fabricius-Haan-Kennedy-Pendleton 
HBA [7,8], which at this coupling is more efficient than 
Creutz’s HBA [6]. 

We used 16,384 sweeps for reaching equilibrium and, 
subsequently, 32 x 20,480 sweeps for measurements. Sim¬ 
ulations were performed on 2 GHz Athlon PCs with the 
-02 option of the (freely available) g77 Fortran compiler. 
Although our programs are not thoroughly optimized, we 
report the runtimes in table I, because we expect their 
ratios to be relatively stable under further optimization. 
(For instance, our runs were fully in real*8 precision. By 
reducing most of the code to real*4 a factor up to two 
might be gained.) 

It is the high acceptance rate of 97.5% which makes 
the BMA almost as efficient as the HBA. In standard 
Metropolis procedures one gets high acceptance rates 
only at the price of small moves, so that acceptance rates 
between 30% and 50% are optimal [4]. In our BMA the 
high acceptance rate is achieved by proposing with an 
approximation of heat bath probabilities for which the 
acceptance rate is 100%. So, an acceptance rate close to 
100% is best for the BMA. The accept/reject step cor¬ 
rects for the failure to approximate the heat bath prob¬ 
ability perfectly. 

Although the acceptance rate for the HBA is 100%, 
the SU{2) HBAs use in their inner loops a reject until 


TABLE I. Efficiency of the SU{2) algorithms on a 4 x 16® 
lattice at f3g = 2.3. For the same lattice size integrated au¬ 
tocorrelation times are also given at Pg = 2.2 and /3g = 2.4. 



HBA [7,8] 

BMA 

CPU time 

194,873 [s] 

199,244 [s] 

Acceptance rate 

1 (1.043 proposals) 

0.975 

(Tr([/o)/2) 

0.603147 (17) 

0.603111 (21) 

"Tint 

49.8 (3.5) 

48.2 (3.8) 

TintiP = 2.2) 

7.1 (0.3) 

8.9 (0.4) 

TintiP = 2.4) 

6.7 (0.4) 

7.0 (1.0) 


accepted (RUA) step. It should be noted that this is 
distinct from the accept/reject step of the BMA. Like in 
the original Metropolis method, the latter cannot be it¬ 
erated until accepted (compare, e.g., p.l37 of Ref. [4]). 
This would introduce an uncontrolled bias, which for the 
original Metropolis algorithm is towards too low energies. 
For the simulation of table I the RUA step of the HBA 
[7,8] needs in the average 1.043 iterations to generate the 
new oo value [10]. For small Pg values the number of it¬ 
erations goes up, so that the Creutz HBA becomes then 
more efficient than the HBA of Fabricius-Haan-Kennedy- 
Pendleton, see [8] for a detailed discussion. Indepen¬ 
dently of Pg the BMA acceptance rate stays always close 
to 100%. 

The difference between a RUA procedure and the ac¬ 
cept/reject step of a BMA becomes important for a 
(checkerboard) parallelization of the updating. While for 
a BMA the speed is uniform at all nodes, this is not the 
case for a RUA method, where all nodes have to wait 
until the last RUA step is completed. For large systems, 
the consequences would be disastrous, so that at the price 
of an arguably negligible bias workers tend to impose an 
upper limit on the number of RUA steps (say three for 
our SU{2) case). 

The integrated autocorrelation time Tint is a direct 
measure for the performance of an algorithm. The num¬ 
ber of sweeps needed to achieve a desired accuracy is 
directly proportional to Tint- Table I gives Tint for the 
Wilson plaquette together with the expectation value of 
this operator. Error bars are given in parenthesis and 
apply to the last digits. They are calculated with respect 
to 32 bins (jackknife bins in case of Tint), relying on the 
data analysis software of [4]. We see that the expectation 
values are well compatible with one another {Q = 0.18 
in a Gaussian difference test). For Tint we know that the 
HBA should give a slightly lower value than the BMA. 
That the Tint data at Pg = 2.3 table come out in the 
opposite order is attributed to a statistical fluctuation. 
This is confirmed by shorter runs which we performed at 
other Pg values, whose Tint results are also listed in the 
table. 
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FIG. 3. Discretization of the cumulative distribution func¬ 
tion F{(p\rl}) for U{1) at fig = 1.0 for the choices m = 16 
(equidistant rf values) and n = 2'^ — 16. 



a 


FIG. 4. Partition of the values for U{1) at Pg = 1.0, 

where the variable a = Pg Vu is used on the abscissa. The 
choices for m and n are the same a in Fig. 3. 


B. U{1) 

Next we consider the C/(l) gauge group. The “ma¬ 
trices” are then complex numbers on the unit circle, 
Uij = exp(z^y ), and the analogue of Eq. (5) becomes 

6 

ru , (14) 



led to the cumulative distribution function 


Fi {(f) = Ni r d(j)' (15) 

Jo 

where the normalization is Ei(27r) = 1 and the angle 
{(j) + (()u) mod(27r) will be stored. 

We test the performance of the U{\) BMA for a 4 x 16^ 
lattice at Pg = 1.0, again a coupling which puts the sys¬ 
tem close to the deconfining phase transition, which is 
weakly first order for U{1) (see for instance [11] and refer¬ 
ences therein). HBAs have been designed in Ref. [12,13]. 
Both HBAs rely on a RUA step, so that the remarks 
made in this connection for SU{2) apply also to C/(l). 
We have only tested the HBA of Ref. [13], which turns 
out to be about 20% slower than our BMA, while the inte¬ 
grated autocorrelation time is about 10% lower. Overall 
an advantage of 10% in favor of the C/(l) BMA, which 
re-iterates that HBAs and BMAs have about equal effi¬ 
ciency, when efficient HBAs exist. 

We compare the 17(1) BMA now with a conventional 
Metropolis algorithm, which proposes new angles uni¬ 
formly in the (entire) range [0, 27r). For the BMA we 
follow the same lines as previously for F(ao) of Eq. (8). 
Fig. 3 plots Fi{(j)) at Pg = 1.0 using a representative r(j 


TABLE II. Efficiency of the 17(1) algorithms on a 4 x 16® 
lattice at Pg = 1.0. 



Metropolis 

BMA 

CPU time 

84,951 [s] 

107,985 [s] 

Acceptance rate 

0.286 

0.972 

(cos pn) 

0.59103 (16) 

0.59106 (12) 

Tint 

341 (26) 

142 (10) 


value and Fig. 4 shows the entire tabulation Acff’P Ta¬ 
ble H summarizes the results. At /3g = 1 the acceptance 
rate of the standard Metropolis procedure is still about 
30%, so that a restriction of the proposal range to in¬ 
crease the acceptance rate is not warranted [4]. From the 
data of the table we conclude that the BMA improves the 
Metropolis performance at = 1 by a factor of about 
two. 

When comparing with a full-range Metropolis algo¬ 
rithm an upper bound on the improvement factor is given 
by the ratio of the acceptance rates, in the present case 
0.972/0.282 = 3.45. This applies also to comparisons of 
such Metropolis algorithms with HBAs, substituting then 
one for the acceptance rate. The bound will normally not 
be saturated, because rms deviations of the new variables 
from the old variables are smaller for a BMA or HBA 
than for a full-range Metropolis algorithm. Larger gains 
can be achieved when the Metropolis acceptance rates 
are small. For 17(1) this happens for Pg :§> I- 


III. SUMMARY AND CONCLUSIONS 

In summary, BMAs are an alternative to HBAs. BMAs 
work still in situations for which HBAs fail, because there 
is no efficient inversion of the cumulative distribution 
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function in question. In lattice gauge theory this is the 
case for some Higgs system and for actions which are 
non-linear in the Wilson plaquette operator (see, e.g., 
Ref. [14] and references therein). Obviously, similar situ¬ 
ations ought to exist for energy functions in many other 
fields. We leave it to the reader to identify whether her 
or his simulations would benefit from using a BMA. Fi¬ 
nally, let us mention that BMAs may be combined with 
overrelaxation moves [15-17] in the same way as one does 
for HBAs or standard Metropolis algorithms. 
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